NSF PAR Search | NSF Public Access Repository


Guidelines for gene and genome assembly nomenclature

https://doi.org/10.1093/genetics/iyaf006

Cannon, Ethalinda KS; Molik, David C; Wright, Adam J; Zhang, Huiting; Honaas, Loren; Chougule, Kapeel; Dyer, Sarah (January 2025, GENETICS)
Harris, T (Ed.)
Abstract The rapid increase in the number of reference-quality genome assemblies presents significant new opportunities for genomic research. However, the absence of standardized naming conventions for genome assemblies and annotations across datasets creates substantial challenges. Inconsistent naming hinders the identification of correct assemblies, complicates the integration of bioinformatics pipelines, and makes it difficult to link assemblies across multiple resources. To address this, we developed a specification for standardizing the naming of reference genome assemblies, to improve consistency across datasets and facilitate interoperability. This specification was created with FAIR (Findable, Accessible, Interoperable, and Reusable) practices in mind, ensuring that reference assemblies are easier to locate, access, and reuse across research communities. Additionally, it has been designed to comply with primary genomic data repositories, including members of the International Nucleotide Sequence Database Collaboration consortium, ensuring compatibility with widely used databases. While initially tailored to the agricultural genomics community, the specification is adaptable for use across different taxa. Widespread adoption of this standardized nomenclature would streamline assembly management, better enable cross-species analyses, and improve the reproducibility of research. It would also enhance natural language processing applications that depend on consistent reference assembly names in genomic literature, promoting greater integration and automated analysis of genomic data. This is a good time to consider more consistent genomic data nomenclature as many research communities and data resources are now finding themselves juggling multiple datasets from multiple data providers.
Full Text Available
A haplotype-resolved, chromosome-scale genome for Malus domestica Borkh. ‘WA 38’

https://doi.org/10.1093/g3journal/jkae222

Zhang, Huiting; Ko, Itsuhiro; Eaker, Abigail; Haney, Sabrina; Khuu, Ninh; Ryan, Kara; Appleby, Aaron B; Hoffmann, Brendan; Landis, Henry; Pierro, Kenneth A; et al (September 2024, G3: Genes, Genomes, Genetics)
McIntyre, L (Ed.)
Abstract Genome sequencing for agriculturally important Rosaceous crops has made rapid progress both in completeness and annotation quality. Whole genome sequence and annotation give breeders, researchers, and growers information about cultivar-specific traits such as fruit quality and disease resistance, and inform strategies to enhance postharvest storage. Here we present a haplotype-phased, chromosomal-level genome of Malus domestica, ‘WA 38’, a new apple cultivar released to market in 2017 as Cosmic Crisp®. Using both short and long-read sequencing data with a k-mer-based approach, chromosomes originating from each parent were assembled and segregated. This is the first pome fruit genome fully phased into parental haplotypes in which chromosomes from each parent are identified and separated into their unique, respective haplomes. The two haplome assemblies, ‘Honeycrisp’ originated HapA and ‘Enterprise’ originated HapB, are about 650 Megabases each, and both have a BUSCO score of 98.7% complete. A total of 53,028 and 54,235 genes were annotated from HapA and HapB, respectively. Additionally, we provide genome-scale comparisons to ‘Gala’, ‘Honeycrisp’, and other relevant cultivars highlighting major differences in genome structure and gene family circumscription. This assembly and annotation was done in collaboration with the American Campus Tree Genomes project that includes ‘WA 38’ (Washington State University), ‘d’Anjou’ pear (Auburn University), and many more. To ensure transparency, reproducibility, and applicability for any genome project, our genome assembly and annotation workflow is recorded in detail and shared under a public GitLab repository. All software is containerized, offering a simple implementation of the workflow.
Full Text Available
GEMmaker: process massive RNA-seq datasets on heterogeneous computational infrastructure

https://doi.org/10.1186/s12859-022-04629-7

Hadish, John A.; Biggs, Tyler D.; Shealy, Benjamin T.; Bender, M. Reed; McKnight, Coleman B.; Wytko, Connor; Smith, Melissa C.; Feltus, F. Alex; Honaas, Loren; Ficklin, Stephen P. (December 2022, BMC Bioinformatics)

Abstract Background: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analysis such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are becoming larger, and combining multiple experiments from sequence repositories can result in datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples can result in challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility. Processing of larger and deeper RNA-seq experiments will become more common as sequencing technology matures. Results: GEMmaker is an nf-core compliant Nextflow workflow that quantifies gene expression from small to massive RNA-seq datasets. GEMmaker ensures results are highly reproducible through the use of versioned containerized software that can be executed on a single workstation, institutional compute cluster, Kubernetes platform, or the cloud. GEMmaker supports popular alignment and quantification tools, providing results in raw and normalized formats. GEMmaker is unique in that it can scale to process thousands of locally or remotely stored samples without exceeding available data storage. Conclusions: Workflows that quantify gene expression are not new, and many already address issues of portability, reusability, and scale in terms of access to CPUs. GEMmaker provides these benefits and adds the ability to scale despite limited data storage infrastructure. This allows users to process hundreds to thousands of RNA-seq samples even when data storage resources are limited. GEMmaker is freely available and fully documented with step-by-step setup and execution instructions.
Full Text Available
A chromosome-scale assembly for ‘d’Anjou’ pear

https://doi.org/10.1093/g3journal/jkae003

Yocca, Alan; Akinyuwa, Mary; Bailey, Nick; Cliver, Brannan; Estes, Harrison; Guillemette, Abigail; Hasannin, Omar; Hutchison, Jennifer; Jenkins, Wren; Kaur, Ishveen; et al (January 2024, G3: Genes, Genomes, Genetics)

Abstract Cultivated pear consists of several Pyrus species with Pyrus communis (European pear) representing a large fraction of worldwide production. As a relatively recently domesticated crop and perennial tree, pear can benefit from genome-assisted breeding. Additionally, comparative genomics within Rosaceae promises greater understanding of evolution within this economically important family. Here, we generate a fully phased chromosome-scale genome assembly of P. communis ‘d’Anjou.’ Using PacBio HiFi and Dovetail Omni-C reads, the genome is resolved into the expected 17 chromosomes, with each haplotype totaling nearly 540 Megabases and a contig N50 of nearly 14 Mb. Both haplotypes are highly syntenic to each other and to the Malus domestica ‘Honeycrisp’ apple genome. Nearly 45,000 genes were annotated in each haplotype, over 90% of which have direct RNA-seq expression evidence. We detect signatures of the known whole-genome duplication shared between apple and pear, and we estimate 57% of d’Anjou genes are retained in duplicate derived from this event. This genome highlights the value of generating phased diploid assemblies for recovering the full allelic complement in highly heterozygous crop species.
Transcriptomics of host-specific interactions in natural populations of the parasitic plant purple witchweed (Striga hermonthica)

https://doi.org/10.1017/wsc.2019.20

Lopez, Lua; Bellis, Emily S.; Wafula, Eric; Hearne, Sarah J.; Honaas, Loren; Ralph, Paula E.; Timko, Michael P.; Unachukwu, Nnanna; dePamphilis, Claude W.; Lasky, Jesse R. (July 2019, Weed Science)

Abstract Host-specific interactions can maintain genetic and phenotypic diversity in parasites that attack multiple host species. Host diversity, in turn, may promote parasite diversity by selection for genetic divergence or plastic responses to host type. The parasitic weed purple witchweed [Striga hermonthica (Delile) Benth.] causes devastating crop losses in sub-Saharan Africa and is capable of infesting a wide range of grass hosts. Despite some evidence for host adaptation and host-by-Striga genotype interactions, little is known about intraspecific Striga genomic diversity. Here we present a study of transcriptomic diversity in populations of S. hermonthica growing on different hosts (maize [Zea mays L.] vs. grain sorghum [Sorghum bicolor (L.) Moench]). We examined gene expression variation and differences in allelic frequency in expressed genes of aboveground tissues from populations in western Nigeria parasitizing each host. Despite low levels of host-based genome-wide differentiation, we identified a set of parasite transcripts specifically associated with each host. Parasite genes in several different functional categories implicated as important in host–parasite interactions differed in expression level and allele on different hosts, including genes involved in nutrient transport, defense and pathogenesis, and plant hormone response. Overall, we provide a set of candidate transcripts that demonstrate host-specific interactions in vegetative tissues of the emerged parasite S. hermonthica. Our study shows how signals of host-specific processes can be detected aboveground, expanding the focus of host–parasite interactions beyond the haustorial connection.
Full Text Available
Co-expression networks provide insights into molecular mechanisms of postharvest temperature modulation of apple fruit to reduce superficial scald

https://doi.org/10.1016/j.postharvbio.2018.09.016

Honaas, Loren A.; Hargarten, Heidi L.; Ficklin, Stephen P.; Hadish, John A.; Wafula, Eric; dePamphilis, Claude W.; Mattheis, James P.; Rudell, David R. (March 2019, Postharvest Biology and Technology)

Full Text Available
